Introduction

For this coding exercise, you will use OpenAI Gym's Taxi-v2 environment to design an algorithm to teach a taxi agent to navigate a small gridworld. The goal is to adapt all that you've learned in the previous lessons to solve a new environment!

Before proceeding, read the description of the environment in subsection 3.1 of this paper.

You can verify that the description in the paper matches the OpenAI Gym environment by peeking at the code here.

Answer the quiz questions below to check your understanding of the environment.

Question 1

How many states are possible in the environment?

SOLUTION:

There are 500 possible states, corresponding to 25 possible grid locations, 5 locations for the passenger, and 4 destinations.
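The count can be checked with a few lines of arithmetic. The sketch below assumes the factorization stated in the solution (25 grid locations, 5 passenger locations, 4 destinations) and an index layout in which the destination varies fastest. The names `encode_state` and `decode_state` are illustrative helpers, not calls into the Gym API, so compare them against the environment code before relying on the exact index layout.

```python
# Sizes taken from the solution above.
GRID_ROWS, GRID_COLS = 5, 5        # 25 possible taxi locations
NUM_PASSENGER_LOCATIONS = 5        # assumed: 4 designated spots plus "inside the taxi"
NUM_DESTINATIONS = 4

NUM_STATES = GRID_ROWS * GRID_COLS * NUM_PASSENGER_LOCATIONS * NUM_DESTINATIONS


def encode_state(taxi_row, taxi_col, passenger_loc, destination):
    """Pack the four state components into one integer (mixed-radix encoding)."""
    index = taxi_row
    index = index * GRID_COLS + taxi_col
    index = index * NUM_PASSENGER_LOCATIONS + passenger_loc
    index = index * NUM_DESTINATIONS + destination
    return index


def decode_state(index):
    """Invert encode_state, recovering the four components."""
    index, destination = divmod(index, NUM_DESTINATIONS)
    index, passenger_loc = divmod(index, NUM_PASSENGER_LOCATIONS)
    taxi_row, taxi_col = divmod(index, GRID_COLS)
    return taxi_row, taxi_col, passenger_loc, destination


print(NUM_STATES)                 # 25 * 5 * 4
print(encode_state(4, 4, 4, 3))   # largest index under this layout
```

Because every combination of the four components maps to a distinct integer, the largest index is one less than the number of states.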

Question 2

How many actions are available to the agent?

There are 4 possible actions, corresponding to moving North, East, South, or West.

There are 6 possible actions, corresponding to moving North, East, South, or West, picking up the passenger, and dropping off the passenger.

There are 4 possible actions, corresponding to increasing or decreasing the speed of the taxi, dropping off the passenger, and picking up the passenger.

SOLUTION:

There are 6 possible actions, corresponding to moving North, East, South, or West, picking up the passenger, and dropping off the passenger.
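As a minimal sketch, the six actions can be written down as an enumeration. The integer values below assume the ordering used in the Gym source (south, north, east, west, pickup, dropoff), so confirm them against the linked code. The `Action` class and `q_table` are illustrative names, and the table is shown only to indicate how the 500 states and 6 actions would size a tabular value estimate.

```python
from enum import IntEnum


class Action(IntEnum):
    # Assumed numbering; verify against the environment source.
    SOUTH = 0
    NORTH = 1
    EAST = 2
    WEST = 3
    PICKUP = 4
    DROPOFF = 5


NUM_STATES = 500            # from the solution to Question 1
NUM_ACTIONS = len(Action)   # four moves plus pickup and dropoff

# One row per state, one estimated action value per column, all starting at zero.
q_table = [[0.0] * NUM_ACTIONS for _ in range(NUM_STATES)]

print(NUM_ACTIONS, len(q_table) * NUM_ACTIONS)
```

A table of this shape holds one entry for every state-action pair, which is small enough to fit the tabular methods from the earlier lessons.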